Symmetric Eigenvalue Decomposition - Algorithms and Error Analysis

We mainly study algorithms for real symmetric matrices, which are the ones most commonly used in the applications described in this course.

For more details, see I. Slapničar, Symmetric Matrix Eigenvalue Techniques and the references therein.

Prerequisites

The reader should be familiar with basic linear algebra concepts and facts about the eigenvalue decomposition and perturbation theory.

Competences

The reader should be able to apply an adequate algorithm to a given problem and to assess the accuracy of the solution.

Backward error and stability

Definitions

If the value of a function $f(x)$ is computed with an algorithm $\mathrm{alg}(x)$, the algorithm error is

$$ \|\mathrm{alg(x)}-f(x)\|, $$

and the relative algorithm error is

$$ \frac{\| \mathrm{alg}(x)-f(x)\|}{\| f(x) \|}, $$

in the respective norms. These errors can be hard or even impossible to estimate directly.

In this case, we assume that the value of $f(x)$ computed by $\mathrm{alg}(x)$ is equal to the exact value of the function for a perturbed argument,

$$ \mathrm{alg}(x)=f(x+\delta x), $$

for some backward error $\delta x$.

The algorithm is stable if the above equality always holds for some small $\delta x$.
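
For illustration, here is a minimal sketch of a backward error computation for the symmetric eigenvalue problem (assuming the LinearAlgebra package and a random test matrix): the decomposition computed by the built-in eigensolver reconstructs a nearby matrix $B+\Delta B$, with the relative backward error $\|\Delta B\|/\|B\|$ of the order of machine precision.

using LinearAlgebra
B=Matrix(Symmetric(rand(5,5)))
λ,U=eigen(B)
# U*Diagonal(λ)*U' is the exact EVD of a perturbed matrix B+ΔB
ΔB=U*Diagonal(λ)*U'-B
norm(ΔB)/norm(B)  # of the order of eps()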

Basic methods

Definitions

The eigenvalue decomposition (EVD) of a real symmetric matrix $A=[a_{ij}]$ is $A=U\Lambda U^T$, where $U$ is an $n\times n$ real orthonormal matrix, $U^TU=UU^T=I_n$, and $\Lambda=\mathop{\mathrm{diag}} (\lambda_1,\dots, \lambda_n)$ is a real diagonal matrix.

The numbers $\lambda_i$ are the eigenvalues of $A$, the vectors $U_{:i}$, $i=1,\dots,n$, are the eigenvectors of $A$, and $AU_{:i}=\lambda_i U_{:i}$, $i=1,\dots,n$.

If $|\lambda_1|> |\lambda_2| \geq \cdots \geq |\lambda_n|$, we say that $\lambda_1$ is the dominant eigenvalue.

Deflation is a process of reducing the size of the matrix whose EVD is to be determined, given that one eigenvector is known.

The shifted matrix of the matrix $A$ is the matrix $A-\mu I$, where $\mu$ is the shift.

The power method starts from a unit vector $x_0$ and computes the sequences $$ \nu_k=x_k^T A x_k, \qquad x_{k+1}= A x_k / \| A x_k \|, \qquad k=0,1,2,\dots, $$ until convergence. Normalization of $x_k$ can be performed in any norm; it serves to maintain the numerical stability of the algorithm (avoiding overflow or underflow).

Inverse iteration is the power method applied to the inverse of a shifted matrix: $$ \nu_k=x_k^T A x_k, \quad x_{k+1}= (A-\mu I)^{-1} x_k, \quad x_{k+1} = x_{k+1}/\|x_{k+1}\|, \quad k=0,1,2,\dots. $$

The QR iteration starts from the matrix $A_0=A$ and forms the sequence of matrices $$ A_k=Q_kR_k \quad \textrm{(QR factorization)}, \qquad A_{k+1}=R_kQ_k,\qquad k=0,1,2,\ldots $$

The shifted QR iteration is the QR iteration applied to a shifted matrix: $$ A_k-\mu I=Q_kR_k \quad \textrm{(QR factorization)}, \quad A_{k+1}=R_kQ_k+\mu I ,\quad k=0,1,2,\ldots $$

Facts

  1. If $\lambda_1$ is the dominant eigenvalue, $x_0$ is not orthogonal to $U_{:1}$, and the norm is the $2$-norm, then $\nu_k\to \lambda_1$ and $x_k\to U_{:1}$. In other words, the power method converges to the dominant eigenvalue and its eigenvector.

  2. The convergence is linear in the sense that $$ |\lambda_1-\nu_k|\approx \left|\frac{c_2}{c_1}\right| \left| \frac{\lambda_2}{\lambda_1}\right|^k,\qquad \|U_{:1}-x_k\|_2 =O\bigg(\bigg| \frac{\lambda_2}{\lambda_1}\bigg|^k\bigg)\!, $$ where $c_i$ is the coefficient of the $i$-th eigenvector in the linear combination expressing the starting vector $x_0$.

  3. Since $\lambda_1$ is not available, the convergence is determined using residuals: if $\| Ax_k-\nu_k x_k\|_2\leq tol$, where $tol$ is a user-prescribed stopping criterion, then $|\lambda_1-\nu_k|\leq tol$.

  4. After computing the dominant eigenpair, we can perform deflation to reduce the given EVD for $A$ to the one of size $n-1$ for $A_1$: $$ \begin{bmatrix} U_{:1} & X \end{bmatrix}^T A \begin{bmatrix} U_{:1} & X \end{bmatrix}= \begin{bmatrix} \lambda_1 & \\ & A_1 \end{bmatrix}, \quad \begin{bmatrix} U_{:1} & X \end{bmatrix} \textrm{orthonormal}, \quad A_1=X^TAX. $$

  5. The EVD of the shifted matrix $A-\mu I$ is $U(\Lambda-\mu I) U^T$.

  6. Inverse iteration requires solving the system of linear equations $(A-\mu I)x_{k+1}= x_k$ for $x_{k+1}$ in each step. At the beginning, the LU factorization of $A-\mu I$ needs to be computed, which requires $2n^3/3$ operations. In each subsequent step, two triangular systems need to be solved, which requires $2n^2$ operations (see the sketch after this list).

  7. If $\mu$ is close to some eigenvalue of $A$, the eigenvalues of the shifted and inverted matrix $(A-\mu I)^{-1}$ satisfy $|\lambda_1|\gg |\lambda_2|\geq\cdots\geq |\lambda_n|$, so the convergence of the inverse iteration method is fast.

  8. If $\mu$ is very close to some eigenvalue of $A$, then the matrix $A-\mu I$ is nearly singular, so the solutions of the linear systems may have large errors. However, these errors are almost entirely in the direction of the dominant eigenvector, so the inverse iteration method is both fast and accurate!

  9. We can further increase the speed of convergence of inverse iteration by substituting the shift $\mu$ with the Rayleigh quotient $\nu_k$, at the cost of computing a new LU factorization in each step.

  10. Matrices $A_k$ and $A_{k+1}$ from both QR iterations are orthogonally similar, $A_{k+1}=Q_k^TA_kQ_k$.

  11. The QR iteration method is essentially equivalent to the power method and the shifted QR iteration method is essentially equivalent to the inverse power method on the shifted matrix.

  12. The straightforward application of the QR iteration requires $O(n^3)$ operations per step, so a better implementation is needed.
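
Inverse iteration itself is not implemented in the examples below, so here is a minimal sketch of Facts 6-9 (the function name myInvIteration, the starting vector, and the stopping criterion are our choices; LinearAlgebra is assumed to be loaded). The LU factorization of $A-\mu I$ is computed once, and only triangular solves are performed inside the loop:

using LinearAlgebra
function myInvIteration(A::Matrix, μ::Real, tol::Real)
    # One-time LU factorization of the shifted matrix, 2n³/3 operations (Fact 6)
    F=lu(A-μ*I)
    x=normalize(ones(size(A,1)))
    ν=x⋅(A*x)
    while norm(A*x-ν*x)>tol
        # Two triangular solves per step, 2n² operations
        x=normalize(F\x)
        # Rayleigh quotient
        ν=x⋅(A*x)
    end
    ν, x
end
# For example, myInvIteration(map(Float64,A),20.0,1e-10) should quickly return
# the eigenvalue of A nearest to the shift μ=20 (Facts 7 and 8).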

Examples

In order to keep the programs simple, in the examples below we do not compute the full matrix of eigenvectors.


In [1]:
function myPower(A::Matrix,x::Vector,tol::Real)
    # Power method: iterate y=A*x until the residual norm(y-ν*x) falls below tol
    y=A*x
    ν=x⋅y
    steps=1
    while norm(y-ν*x)>tol
        x=y/norm(y)
        y=A*x
        ν=x⋅y
        # display(x)
        steps+=1
    end
    ν, y/norm(y), steps
end


Out[1]:
myPower (generic function with 1 method)

In [2]:
import Random
Random.seed!(421)
using LinearAlgebra
n=6
A=Matrix(Symmetric(rand(-9:9,n,n)))


Out[2]:
6×6 Array{Int64,2}:
  9   5  -3  4  -8  -6
  5  -3   3  9  -5   4
 -3   3   4  8  -4   6
  4   9   8  4   4   1
 -8  -5  -4  4   2   9
 -6   4   6  1   9   2

In [3]:
ν,x,steps=myPower(A,ones(n),1e-10)


Out[3]:
(21.061473427807602, [-0.6939421250694533, -0.19708219091193593, 0.08496442837367736, -0.07925651825421791, 0.5182952923959164, 0.44437864168412405], 194)

In [4]:
λ=eigvals(A)


Out[4]:
6-element Array{Float64,1}:
 -16.79970991489491
  -6.223271345941228
  -2.3171547974910998
   4.000831230433496
  18.27783140008613
  21.061473427807595

In [5]:
# Speed of convergence (Fact 2): the error is reduced by the factor (|λ₂|/|λ₁|)ᵏ
(18/21)^100


Out[5]:
2.0198589217018641e-7

In [7]:
findmax(broadcast(abs,λ))


Out[7]:
(21.061473427807595, 6)

In [8]:
k=findmax(broadcast(abs,λ))[2]


Out[8]:
6

In [9]:
[eigvecs(A)[:,k] x]


Out[9]:
6×2 Array{Float64,2}:
 -0.693942   -0.693942
 -0.197082   -0.197082
  0.0849644   0.0849644
 -0.0792565  -0.0792565
  0.518295    0.518295
  0.444379    0.444379

In [10]:
# Deflation
function myDeflation(A::Matrix,x::Vector)
    X,R=qr(x)
    # To make sure the returned matrix is symmetric, use
    # Matrix(Symmetric(X[:,2:end]'*A*X[:,2:end]))
    # display(X'*A*X)
    X[:,2:end]'*A*X[:,2:end]
end


Out[10]:
myDeflation (generic function with 1 method)

In [11]:
A₁=myDeflation(A,x)


Out[11]:
5×5 Array{Float64,2}:
 -4.32672   3.67021  8.23502  -2.11002   6.37787
  3.67021   3.66871  8.3693   -5.50428   4.75335
  8.23502   8.3693   3.59929   5.77085   2.47811
 -2.11002  -5.50428  5.77085  -4.02468   4.09738
  6.37787   4.75335  2.47811   4.09738  -1.97807

In [12]:
eigvals(A₁)


Out[12]:
5-element Array{Float64,1}:
 -16.7997099148949
  -6.223271345941226
  -2.3171547974911015
   4.000831230433492
  18.277831400086136

In [13]:
myPower(A₁,ones(size(A₁,1)),1e-10)


Out[13]:
(18.277831400086143, [0.4129612522067296, 0.5580725733889244, 0.6281234215797172, 0.049637010061582575, 0.3478723063166265], 301)

In [14]:
# Put it all together - eigenvectors are omitted for the sake of simplicity
function myPowerMethod(A::Matrix{T}, tol::Real) where T
    n=size(A,1)
    S =  T==Int ? Float64 : T
    λ=Vector{S}(undef,n)
    for i=1:n
        λ[i],x,steps=myPower(A,ones(n-i+1),tol)
        A=myDeflation(A,x)
    end
    λ
end


Out[14]:
myPowerMethod (generic function with 1 method)

In [15]:
myPowerMethod(A,1e-10)


Out[15]:
6-element Array{Float64,1}:
  21.061473427807602
  18.277831400086143
 -16.799709914894905
  -6.223271345941228
   4.0008312304334925
  -2.317154797491101

In [16]:
# QR iteration
function myQRIteration(A::Matrix, tol::Real)
    steps=1
    while norm(tril(A,-1))>tol
        Q,R=qr(A)
        A=R*Q
        steps+=1
    end
    A,steps
end


Out[16]:
myQRIteration (generic function with 1 method)

In [17]:
B,steps=myQRIteration(A,1e-5)
B


Out[17]:
6×6 Array{Float64,2}:
 21.0615        -2.78909e-12     7.96303e-16   …   2.48755e-15  -2.72846e-15
 -2.78604e-12   18.2778         -9.876e-6         -6.14606e-15  -2.38195e-15
 -6.25578e-18   -9.876e-6      -16.7997            6.9766e-15    5.74499e-16
 -5.69647e-95   -3.23163e-83    -4.55736e-77       2.07947e-15   4.4179e-16
 -5.80759e-130  -2.45137e-119    1.27073e-112      4.00083      -1.82564e-15
 -3.87577e-173   8.95919e-162    2.2339e-155   …   1.08137e-43  -2.31715

In [18]:
steps


Out[18]:
182

In [19]:
diag(B)


Out[19]:
6-element Array{Float64,1}:
  21.061473427807684
  18.277831400083393
 -16.799709914892162
  -6.223271345941225
   4.000831230433501
  -2.317154797491102

Tridiagonalization

The following implementation of $QR$ iteration requires a total of $O(n^3)$ operations:

  1. Reduce $A$ to tridiagonal form $T$ by orthogonal similarities, $X^TAX=T$.
  2. Compute the EVD of $T$ with QR iterations, $T=Q\Lambda Q^T$.
  3. Multiply $U=XQ$.

One step of QR iterations on $T$ requires $O(n)$ operations if only $\Lambda$ is computed, and $O(n^2)$ operations if $Q$ is accumulated as well.

Definitions

Given a vector $v$, the Householder reflector is the symmetric orthogonal matrix

$$ H= I - 2\, \frac{v v^T}{v^Tv}. $$

Given $c=\cos\varphi$, $s=\sin\varphi$, and indices $i<j$, the Givens rotation matrix is the orthogonal matrix $G(c,s,i,j)$ which is equal to the identity matrix except for the elements

$$ G_{ii}=G_{jj}=c,\qquad G_{ij}=-G_{ji}=s. $$

Facts

  1. Given a vector $x$, choosing $v=x+\mathop{\mathrm{sign}}(x_{1})\,\|x\|_2\, e_1$ yields the Householder reflector which performs the QR factorization of $x$: $$ Hx=-\mathop{\mathrm{sign}}(x_{1})\, \|x\|_2\, e_1. $$

  2. Given a 2-dimensional vector $\begin{bmatrix}x\\y\end{bmatrix}$, choosing $r= \sqrt{x^2+y^2}$, $c=\frac{x}{r}$ and $s=\frac{y}{r}$, gives the Givens rotation matrix such that $$ G(c,s,1,2)\cdot \begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}r\\ 0 \end{bmatrix}. $$ The hypotenuse $r$ is computed using the hypot() function in order to avoid underflow or overflow (see the sketch after this list).

  3. Tridiagonal form is not unique.

  4. The reduction of $A$ to tridiagonal matrix by Householder reflections is performed as follows. Let $$ A=\begin{bmatrix} \alpha & a^T \\ a & B \end{bmatrix}. $$ Let $v=a+\mathrm{\mathop{sign}}(a_{1})\,\|a\|_2\, e_1$, let $H$ be the corresponding Householder reflector and set $$ H_1=\begin{bmatrix} 1 & \\ & H \end{bmatrix}. $$ Then $$ H_1AH_1= \begin{bmatrix} \alpha & a^T H \\ Ha & HBH \end{bmatrix} = \begin{bmatrix} \alpha & \nu e_1^T \\ \nu e_1 & A_1 \end{bmatrix}, \quad \nu=-\mathop{\mathrm{sign}}(a_{1})\, \|a\|_2. $$ This step annihilates all elements in the first column below the first subdiagonal and all elements in the first row to the right of the first subdiagonal. Applying this procedure recursively yields the tridiagonal matrix $T=X^T AX$, where $X=H_1H_2\cdots H_{n-2}$.

  5. $H$ does not depend on the normalization of $v$. With the normalization $v_1=1$, $a_{2:n-1}$ can be overwritten by $v_{2:n-1}$, so $v_1$ does not need to be stored.

  6. $H$ is not formed explicitly - given $v$, $B$ is overwritten with $HBH$ in $O(n^2)$ operations by using one matrix-vector multiplication and two rank-one updates.

  7. When symmetry is exploited in performing the rank-2 update, tridiagonalization requires $4n^3/3$ operations. Instead of performing the rank-2 update on $B$, one can accumulate $p$ transformations and perform a rank-$2p$ update. This block algorithm is rich in matrix-matrix multiplications (roughly one half of the operations are performed using BLAS 3 routines), but it requires extra workspace for $U$ and $V$.

  8. If $X$ is needed explicitly, it can be computed from the stored Householder vectors $v$. In order to minimize the operation count, the computation starts from the smallest matrix and the size is gradually increased: $$ H_{n-2},\quad H_{n-3}H_{n-2},\dots,\quad X=H_1\cdots H_{n-2}. $$ A column-oriented version is possible as well, and the operation count in both cases is $4n^3/3$. If the Householder reflectors $H_i$ are accumulated in the order in which they are generated, the operation count is $2n^3$.

  9. The backward error bounds for functions myTridiag() and myTridiagX() are as follows: The computed matrix $\tilde T$ is equal to the matrix which would be obtained by exact tridiagonalization of some perturbed matrix $A+\Delta A$, where $\|\Delta A\|_2 \leq \psi \varepsilon \|A\|_2$ and $\psi$ is a slowly increasing function of $n$. The computed matrix $\tilde X$ satisfies $\tilde X=X+\Delta X$, where $\|\Delta X \|_2\leq \phi \varepsilon$ and $\phi$ is a slowly increasing function of $n$.

  10. Tridiagonalization using Givens rotations requires $\frac{(n-1)(n-2)}{2}$ plane rotations, which amounts to $4n^3$ operations if symmetry is properly exploited. The operation count is reduced to $8n^3/3$ if fast rotations are used. Fast rotations are obtained by factoring out the absolutely larger of $c$ and $s$ from $G$.

  11. Givens rotations in the function myTridiagG() can be performed in different orderings. For example, the elements in the first column and row can be annihilated by rotations in the planes $(n-1,n)$, $(n-2,n-1)$, $\ldots$, $(2,3)$. Givens rotations act more selectively than Householder reflectors, and are useful if $A$ has some special structure, for example, if $A$ is a banded matrix.

  12. Error bounds for function myTridiagG() are the same as above, but with slightly different functions $\psi$ and $\phi$.

  13. The block version of tridiagonal reduction is implemented in the LAPACK subroutine DSYTRD. The computation of $X$ is implemented in the subroutine DORGTR. The size of the required extra workspace (in elements) is $lwork=nb*n$, where $nb$ is the optimal block size (here, $nb=64$), and it is determined automatically by the subroutines. The subroutine DSBTRD tridiagonalizes a symmetric band matrix by using Givens rotations. There are no Julia wrappers for these routines yet!
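
Here is a minimal numerical check of Facts 1 and 2 (a sketch; the test vector is random and LinearAlgebra is assumed to be loaded):

using LinearAlgebra
x=rand(5)
# Fact 1: the Householder reflector annihilates all but the first entry of x
v=x+sign(x[1])*norm(x)*[1;zeros(4)]
H=I-2*v*v'/(v'*v)
H*x  # ≈ -sign(x[1])*norm(x)*e₁
# Fact 2: Givens rotation acting on a 2-dimensional vector;
# hypot() guards against underflow and overflow
r=hypot(x[1],x[2])
c=x[1]/r; s=x[2]/r
G=[c s; -s c]
G*x[1:2]  # ≈ [r; 0]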

Examples


In [20]:
function myTridiag(A::Matrix)
    # Normalized Householder vectors are stored in the lower 
    # triangular part of A below the first subdiagonal
    n=size(A,1)
    T=Float64
    A=map(T,A)
    v=Vector{T}(undef,n)
    Trid=SymTridiagonal(zeros(n),zeros(n-1))
    for j = 1 : n-2
        μ = sign(A[j+1,j])*norm(A[j+1:n, j])
        if μ != zero(T)
            β =A[j+1,j]+μ
            v[j+2:n] = A[j+2:n,j] / β
        end
        A[j+1,j]=-μ
        A[j,j+1]=-μ
        v[j+1] = one(T)
        γ = -2 / (v[j+1:n]⋅v[j+1:n])
        w = γ* A[j+1:n, j+1:n]*v[j+1:n]
        q = w + γ * v[j+1:n]*(v[j+1:n]⋅w) / 2
        A[j+1:n, j+1:n] = A[j+1:n,j+1:n] + v[j+1:n]*q' + q*v[j+1:n]'
        A[j+2:n, j] = v[j+2:n]
    end
    SymTridiagonal(diag(A),diag(A,1)), tril(A,-2)
end


Out[20]:
myTridiag (generic function with 1 method)

In [21]:
A


Out[21]:
6×6 Array{Int64,2}:
  9   5  -3  4  -8  -6
  5  -3   3  9  -5   4
 -3   3   4  8  -4   6
  4   9   8  4   4   1
 -8  -5  -4  4   2   9
 -6   4   6  1   9   2

In [22]:
T,H=myTridiag(A)


Out[22]:
([9.0 -12.24744871391589 … 0.0 0.0; -12.24744871391589 6.98 … 0.0 0.0; … ; 0.0 0.0 … 9.110305732497977 -10.030701389069694; 0.0 0.0 … -10.030701389069694 4.583723406290584], [0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0; … ; -0.463836717690617 -0.11816614373061109 … 0.0 0.0; -0.34787753826796275 -0.31696739345222574 … 0.0 0.0])

In [23]:
T


Out[23]:
6×6 SymTridiagonal{Float64,Array{Float64,1}}:
   9.0     -12.2474     ⋅         ⋅          ⋅          ⋅ 
 -12.2474    6.98      5.77982    ⋅          ⋅          ⋅ 
    ⋅        5.77982  -2.08843   8.03697     ⋅          ⋅ 
    ⋅         ⋅        8.03697  -9.5856    -6.81167     ⋅ 
    ⋅         ⋅         ⋅       -6.81167    9.11031  -10.0307
    ⋅         ⋅         ⋅         ⋅       -10.0307     4.58372

In [24]:
[eigvals(A) eigvals(T)]


Out[24]:
6×2 Array{Float64,2}:
 -16.7997   -16.7997
  -6.22327   -6.22327
  -2.31715   -2.31715
   4.00083    4.00083
  18.2778    18.2778
  21.0615    21.0615

In [25]:
H


Out[25]:
6×6 Array{Float64,2}:
  0.0        0.0        0.0       0.0        0.0  0.0
  0.0        0.0        0.0       0.0        0.0  0.0
 -0.173939   0.0        0.0       0.0        0.0  0.0
  0.231918  -0.224803   0.0       0.0        0.0  0.0
 -0.463837  -0.118166  -0.943744  0.0        0.0  0.0
 -0.347878  -0.316967   0.104262  0.0401482  0.0  0.0

In [26]:
v1=[1;H[3:6,1]]


Out[26]:
5-element Array{Float64,1}:
  1.0
 -0.17393876913398137
  0.2319183588453085
 -0.463836717690617
 -0.34787753826796275

In [27]:
H0=I-2*v1*v1'/(v1'*v1)


Out[27]:
5×5 Array{Float64,2}:
 -0.408248   0.244949   -0.326599    0.653197   0.489898
  0.244949   0.957394    0.0568082  -0.113616  -0.0852122
 -0.326599   0.0568082   0.924256    0.151488   0.113616
  0.653197  -0.113616    0.151488    0.697023  -0.227233
  0.489898  -0.0852122   0.113616   -0.227233   0.829576

In [28]:
H1=[1 zeros(1,5);zeros(5,1) H0]


Out[28]:
6×6 Array{Float64,2}:
 1.0   0.0        0.0         0.0         0.0        0.0
 0.0  -0.408248   0.244949   -0.326599    0.653197   0.489898
 0.0   0.244949   0.957394    0.0568082  -0.113616  -0.0852122
 0.0  -0.326599   0.0568082   0.924256    0.151488   0.113616
 0.0   0.653197  -0.113616    0.151488    0.697023  -0.227233
 0.0   0.489898  -0.0852122   0.113616   -0.227233   0.829576

In [29]:
H1*A*H1


Out[29]:
6×6 Array{Float64,2}:
   9.0          -12.2474    4.16334e-16  …   1.11022e-15  6.66134e-16
 -12.2474         6.98     -4.14289          1.17253      3.14517
   0.0           -4.14289   6.18291         -2.56569      8.02965
  -3.33067e-16    2.23065   7.92347          9.64497      3.96183
   1.11022e-15    1.17253  -2.56569         -5.87323      5.63886
   8.88178e-16    3.14517   8.02965      …   5.63886      1.38698

In [30]:
# Extract X
function myTridiagX(H::Matrix)
    n=size(H,1)
    T=Float64
    X = Matrix{T}(I,n,n)
    v=Vector{T}(undef,n)
    for j = n-2 : -1 : 1
        v[j+1] = one(T)
        v[j+2:n] = H[j+2:n, j]
        γ = -2 / (v[j+1:n]⋅v[j+1:n])
        w = γ * X[j+1:n, j+1:n]'*v[j+1:n]
        X[j+1:n, j+1:n] = X[j+1:n, j+1:n] + v[j+1:n]*w'
    end
    X
end


Out[30]:
myTridiagX (generic function with 1 method)

In [31]:
X=myTridiagX(H)


Out[31]:
6×6 Array{Float64,2}:
 1.0   0.0        0.0         0.0         0.0         0.0
 0.0  -0.408248   0.0974742   0.622022    0.142048    0.645556
 0.0   0.244949  -0.73374     0.0135138  -0.516899    0.366412
 0.0  -0.326599   0.408544    0.0651098  -0.837043   -0.14678
 0.0   0.653197   0.157654    0.689721   -0.0790418  -0.257908
 0.0   0.489898   0.510256   -0.364626   -0.0758165   0.600781

In [32]:
# Fact 9: norm(ΔX) ≤ ϕ*eps()
X'*X


Out[32]:
6×6 Array{Float64,2}:
 1.0   0.0           0.0           0.0           0.0           0.0
 0.0   1.0          -1.20734e-16  -1.50768e-16  -9.19151e-17  -2.83227e-16
 0.0  -1.20734e-16   1.0           2.93267e-17   1.55967e-16  -5.61122e-17
 0.0  -1.50768e-16   2.93267e-17   1.0           2.00692e-16   1.72259e-16
 0.0  -9.19151e-17   1.55967e-16   2.00692e-16   1.0           6.37778e-17
 0.0  -2.83227e-16  -5.61122e-17   1.72259e-16   6.37778e-17   1.0

In [33]:
X'*A*X


Out[33]:
6×6 Array{Float64,2}:
   9.0          -12.2474        6.66134e-16  …    1.08247e-15    3.33067e-15
 -12.2474         6.98          5.77982           8.04185e-16   -2.07278e-15
   8.88178e-16    5.77982      -2.08843           1.46696e-15   -2.66763e-15
   8.88178e-16    9.13998e-17   8.03697          -6.81167        2.94038e-15
   1.11022e-15    1.66748e-17   2.28269e-15       9.11031      -10.0307
   3.55271e-15   -2.34076e-15  -1.63976e-15  …  -10.0307         4.58372

In [34]:
# Tridiagonalization using Givens rotations
function myTridiagG(A::Matrix)
    n=size(A,1)
    T=Float64
    X=Matrix{T}(I,n,n)
    for j = 1 : n-2
        for i = j+2 : n
            G,r=givens(A,j+1,i,j)
            A=(G*A)*G'
            # display(A)
            X*=G'
        end
    end
    SymTridiagonal(diag(A),diag(A,1)), X
end


Out[34]:
myTridiagG (generic function with 1 method)

In [35]:
# Let us take a look at the `givens()` functions
methods(givens)


Out[35]:
# 3 methods for generic function givens:

In [36]:
?givens


search: givens

Out[36]:
givens(f::T, g::T, i1::Integer, i2::Integer) where {T} -> (G::Givens, r::T)

Computes the Givens rotation G and scalar r such that for any vector x where

x[i1] = f
x[i2] = g

the result of the multiplication

y = G*x

has the property that

y[i1] = r
y[i2] = 0

See also: LinearAlgebra.Givens


givens(A::AbstractArray, i1::Integer, i2::Integer, j::Integer) -> (G::Givens, r)

Computes the Givens rotation G and scalar r such that the result of the multiplication

B = G*A

has the property that

B[i1,j] = r
B[i2,j] = 0

See also: LinearAlgebra.Givens


givens(x::AbstractVector, i1::Integer, i2::Integer) -> (G::Givens, r)

Computes the Givens rotation G and scalar r such that the result of the multiplication

B = G*x

has the property that

B[i1] = r
B[i2] = 0

See also: LinearAlgebra.Givens


In [37]:
Tg,Xg=myTridiagG(map(Float64,A))


Out[37]:
([9.0 12.247448713915892 … 0.0 0.0; 12.247448713915892 6.98 … 0.0 0.0; … ; 0.0 0.0 … 9.110305732497977 10.030701389069696; 0.0 0.0 … 10.030701389069696 4.583723406290594], [1.0 0.0 … 0.0 0.0; 0.0 0.4082482904638629 … -0.14204829345405623 0.6455564660494066; … ; 0.0 -0.653197264742181 … 0.07904182514164797 -0.25790785656125237; 0.0 -0.48989794855663565 … 0.07581647320183352 0.6007814442628551])

In [38]:
Tg


Out[38]:
6×6 SymTridiagonal{Float64,Array{Float64,1}}:
  9.0     12.2474     ⋅         ⋅         ⋅         ⋅ 
 12.2474   6.98      5.77982    ⋅         ⋅         ⋅ 
   ⋅       5.77982  -2.08843   8.03697    ⋅         ⋅ 
   ⋅        ⋅        8.03697  -9.5856   -6.81167    ⋅ 
   ⋅        ⋅         ⋅       -6.81167   9.11031  10.0307
   ⋅        ⋅         ⋅         ⋅       10.0307    4.58372

In [39]:
Xg'*Xg


Out[39]:
6×6 Array{Float64,2}:
 1.0  0.0           0.0           0.0           0.0           0.0
 0.0  1.0           4.69211e-17   1.26227e-16   1.36396e-17   6.34259e-17
 0.0  4.69211e-17   1.0          -6.84002e-17  -1.7415e-17   -1.57021e-16
 0.0  1.26227e-16  -6.84002e-17   1.0           2.6043e-18    1.25154e-16
 0.0  1.36396e-17  -1.7415e-17    2.6043e-18    1.0          -4.34495e-18
 0.0  6.34259e-17  -1.57021e-16   1.25154e-16  -4.34495e-18   1.0

In [40]:
Xg'*A*Xg


Out[40]:
6×6 Array{Float64,2}:
  9.0          12.2474        4.44089e-16  …  -1.16573e-15   1.11022e-15
 12.2474        6.98          5.77982          7.42096e-16  -7.63126e-16
  4.44089e-16   5.77982      -2.08843         -1.6576e-16    1.66283e-15
  0.0           1.45234e-16   8.03697         -6.81167      -2.42829e-15
 -1.16573e-15   4.87681e-16   6.08135e-17      9.11031      10.0307
  8.88178e-16  -3.14801e-16   1.35468e-15  …  10.0307        4.58372

In [41]:
T


Out[41]:
6×6 SymTridiagonal{Float64,Array{Float64,1}}:
   9.0     -12.2474     ⋅         ⋅          ⋅          ⋅ 
 -12.2474    6.98      5.77982    ⋅          ⋅          ⋅ 
    ⋅        5.77982  -2.08843   8.03697     ⋅          ⋅ 
    ⋅         ⋅        8.03697  -9.5856    -6.81167     ⋅ 
    ⋅         ⋅         ⋅       -6.81167    9.11031  -10.0307
    ⋅         ⋅         ⋅         ⋅       -10.0307     4.58372

In [42]:
Q,R=qr(Matrix(T))


Out[42]:
LinearAlgebra.QRCompactWY{Float64,Array{Float64,2}}
Q factor:
6×6 LinearAlgebra.QRCompactWYQ{Float64,Array{Float64,2}}:
 -0.592157  -0.567632  -0.0677113   0.399955   -0.2305      0.330856
  0.805823  -0.417123  -0.0497575   0.293906   -0.169382    0.243129
  0.0        0.709791  -0.0833909   0.49257    -0.283875    0.407471
  0.0        0.0       -0.992968   -0.0833676   0.0480459  -0.0689645
  0.0        0.0        0.0        -0.709986   -0.402551    0.577817
  0.0        0.0        0.0         0.0         0.820511    0.57163
R factor:
6×6 Array{Float64,2}:
 -15.1987  12.8771    4.65751  0.0        0.0       0.0
   0.0      8.14299  -3.89324  5.70457    0.0       0.0
   0.0      0.0      -8.09389  8.84799    6.76377   0.0
   0.0      0.0       0.0      9.59409   -5.90032   7.12166
   0.0      0.0       0.0      0.0      -12.2249    7.79887
   0.0      0.0       0.0      0.0        0.0      -3.17571

In [43]:
R*Q


Out[43]:
6×6 Array{Float64,2}:
 19.3766    6.56181  -8.88178e-16   1.87596e-15   1.11006e-15   0.0
  6.56181  -6.16001  -5.74497       0.0           8.88178e-16  -4.63697e-16
  0.0      -5.74497  -8.11081      -9.52663      -8.88178e-16   0.0
  0.0       0.0      -9.52663       3.38931       8.67954       8.88178e-16
  0.0       0.0       0.0           8.67954      11.3202       -2.60571
  0.0       0.0       0.0           0.0          -2.60571      -1.81533

Tridiagonal QR method

Let $T$ be a real symmetric tridiagonal matrix of order $n$ and $T=Q\Lambda Q^T$ be its EVD.

Each step of the shifted QR iterations can be elegantly implemented without explicitly computing the shifted matrix $T-\mu I$.

Definition

Wilkinson's shift $\mu$ is the eigenvalue of the bottom right $2\times 2$ submatrix of $T$, which is closer to $T_{n,n}$.

Facts

  1. The stable formula for Wilkinson's shift is $$ \mu=T_{n,n}- \frac{T_{n,n-1}^2}{\tau+\mathop{\mathrm{sign}}(\tau)\sqrt{\tau^2+T_{n,n-1}^2}},\qquad \tau=\frac{T_{n-1,n-1}-T_{n,n}}{2}. $$ (See the sketch after this list.)

  2. Wilkinson's shift is the most commonly used shift. With Wilkinson's shift, the algorithm always converges in the sense that $T_{n-1,n}\to 0$. The convergence is quadratic, that is, $|[T_{k+1}]_{n-1,n}|\leq c |[T_{k}]_{n-1,n}|^2$ for some constant $c$, where $T_k$ is the matrix after the $k$-th sweep. Even more, the convergence is usually cubic. However, it can also happen that some $T_{i,i+1}$, $i\neq n-1$, becomes sufficiently small before $T_{n-1,n}$, so a practical program has to check for deflation at each step.

  3. Chasing the Bulge. The plane rotation parameters at the start of the sweep are computed as if the shifted $T-\mu I$ has been formed. Since the rotation is applied to the original $T$ and not to $T-\mu I$, this creates new nonzero elements at the positions $(3,1)$ and $(1, 3)$, the so-called bulge. The subsequent rotations simply chase the bulge out of the lower right corner of the matrix. The rotation in the $(2,3)$ plane sets the elements $(3,1)$ and $(1,3)$ back to zero, but it generates two new nonzero elements at positions $(4,2)$ and $(2,4)$; the rotation in the $(3,4)$ plane sets the elements $(4,2)$ and $(2,4)$ back to zero, but it generates two new nonzero elements at positions $(5,3)$ and $(3,5)$, etc.

  4. Implicit $Q$ Theorem. The effect of this procedure is the following. At the end of the first sweep, the resulting matrix $T_1$ is equal to the matrix that would have been obtained by factorizing $T-\mu I=QR$ and computing $T_1=RQ+\mu I$.

  5. Since the convergence of the function myTridEigQR() is quadratic (or even cubic), an eigenvalue is isolated after just a few steps, which requires $O(n)$ operations. This means that $O(n^2)$ operations are needed to compute all eigenvalues.

  6. If the eigenvector matrix $Q$ is desired, the plane rotations need to be accumulated similarly to the accumulation of $X$ in the function myTridiagG(). This accumulation requires $O(n^3)$ operations. Another, faster, approach is to first compute only $\Lambda$ and then compute $Q$ using inverse iteration. Inverse iteration on a tridiagonal matrix is implemented in the LAPACK routine DSTEIN.

  7. Error bounds. Let $U\Lambda U^T$ and $\tilde U \tilde \Lambda \tilde U^T$ be the exact and the computed EVDs of $A$, respectively, such that the diagonals of $\Lambda$ and $\tilde \Lambda$ are in the same order. Numerical methods generally compute the EVD with the errors bounded by $$ |\lambda_i-\tilde \lambda_i|\leq \phi \epsilon\|A\|_2, \qquad \|u_i-\tilde u_i\|_2\leq \psi\epsilon \frac{\|A\|_2} {\min_{j\neq i} |\lambda_i-\tilde \lambda_j|}, $$ where $\epsilon$ is machine precision and $\phi$ and $\psi$ are slowly growing polynomial functions of $n$ which depend upon the algorithm used (typically $O(n)$ or $O(n^2)$). Such bounds are obtained by combining perturbation bounds with the floating-point error analysis of the respective algorithms.

  8. The eigenvalue decomposition $T=Q\Lambda Q^T$ computed by myTridEigQR() satisfies the error bounds from Fact 7 with $A$ replaced by $T$ and $U$ replaced by $Q$. The deflation criterion implies $|T_{i,i+1}|\leq \epsilon \|T\|_F$, which is within these bounds.

  9. The EVD computed by the function mySymEigQR() satisfies the error bounds given in Fact 7. However, the algorithm tends to perform better on matrices which are graded downwards, that is, matrices that exhibit a systematic decrease in the size of the elements as we move along the diagonal.
    For such matrices the tiny eigenvalues can usually be computed with higher relative accuracy (although counterexamples can be easily constructed). If the tiny eigenvalues are of interest, it should be checked whether there exists a symmetric permutation that moves larger elements to the upper left corner, thus converting the given matrix to one that is graded downwards.

  10. The function myTridEigQR() is implemented in the LAPACK subroutine DSTEQR. This routine can compute just the eigenvalues, or both eigenvalues and eigenvectors.

  11. The function mySymEigQR() (Algorithm 5) is implemented in the functions eigen(), eigvals(), and eigvecs(), and in the LAPACK routine DSYEV. To compute only eigenvalues, DSYEV calls DSYTRD and DSTEQR without the eigenvector option. To compute both eigenvalues and eigenvectors, DSYEV calls DSYTRD, DORGTR, and DSTEQR with the eigenvector option.
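
The following sketch illustrates Facts 1 and 4 on the tridiagonal matrix T computed in the examples above (assumed to still be in scope, with LinearAlgebra loaded): the Wilkinson shift obtained from the stable formula is an eigenvalue of the trailing $2\times 2$ submatrix, and one explicit shifted QR step already makes the last off-diagonal element tiny.

# Wilkinson's shift via the stable formula (Fact 1)
τ=(T[end-1,end-1]-T[end,end])/2
μ=T[end,end]-T[end,end-1]^2/(τ+sign(τ)*sqrt(τ^2+T[end,end-1]^2))
# μ should match the eigenvalue of the trailing 2×2 submatrix closer to T[end,end]
eigvals(Matrix(T[end-1:end,end-1:end]))
# One explicit shifted QR step (Fact 4); abs(T₁[end,end-1]) should be tiny
Q,R=qr(Matrix(T)-μ*I)
T₁=R*Q+μ*I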

Examples


In [44]:
function myTridEigQR(A1::SymTridiagonal)
    A=deepcopy(A1)
    n=length(A.dv)
    T=Float64
    λ=Vector{T}(undef,n)
    B=Matrix{T}
    if n==1
        return map(T,A.dv)
    end
    if n==2
        τ=(A.dv[end-1]-A.dv[end])/2
        μ=A.dv[end]-A.ev[end]^2/(τ+sign(τ)*sqrt(τ^2+A.ev[end]^2))
        # Only rotation
        B=A[1:2,1:2]
        G,r=givens(B-μ*I,1,2,1)
        B=(G*B)*G'
        return diag(B)[1:2]
    end
    steps=1
    k=0
    while k==0 && steps<=10
        # Shift
        τ=(A.dv[end-1]-A.dv[end])/2
        μ=A.dv[end]-A.ev[end]^2/(τ+sign(τ)*sqrt(τ^2+A.ev[end]^2))
        # First rotation
        B=A[1:3,1:3]
        G,r=givens(B-μ*I,1,2,1)
        B=(G*B)*G'
        A.dv[1:2]=diag(B)[1:2]
        A.ev[1:2]=diag(B,-1)
        bulge=B[3,1]
        # Bulge chasing
        for i = 2 : n-2
            B=A[i-1:i+2,i-1:i+2]
            B[3,1]=bulge
            B[1,3]=bulge
            G,r=givens(B,2,3,1)
            B=(G*B)*G'
            A.dv[i:i+1]=diag(B)[2:3]
            A.ev[i-1:i+1]=diag(B,-1)
            bulge=B[4,2]
        end
        # Last rotation
        B=A[n-2:n,n-2:n]
        B[3,1]=bulge
        B[1,3]=bulge
        G,r=givens(B,2,3,1)
        B=(G*B)*G'
        A.dv[n-1:n]=diag(B)[2:3]
        A.ev[n-2:n-1]=diag(B,-1)
        steps+=1
        # Deflation criterion
        k=findfirst(abs.(A.ev) .< sqrt.(abs.(A.dv[1:n-1].*A.dv[2:n]))*eps(T))
        k=k==nothing ? 0 : k
        # display(A)
    end
    λ[1:k]=myTridEigQR(SymTridiagonal(A.dv[1:k],A.ev[1:k-1]))
    λ[k+1:n]=myTridEigQR(SymTridiagonal(A.dv[k+1:n],A.ev[k+1:n-1]))
    λ
end


Out[44]:
myTridEigQR (generic function with 1 method)

In [45]:
λ=eigvals(T)


Out[45]:
6-element Array{Float64,1}:
 -16.7997099148949
  -6.223271345941231
  -2.317154797491101
   4.000831230433498
  18.277831400086136
  21.061473427807595

In [46]:
T


Out[46]:
6×6 SymTridiagonal{Float64,Array{Float64,1}}:
   9.0     -12.2474     ⋅         ⋅          ⋅          ⋅ 
 -12.2474    6.98      5.77982    ⋅          ⋅          ⋅ 
    ⋅        5.77982  -2.08843   8.03697     ⋅          ⋅ 
    ⋅         ⋅        8.03697  -9.5856    -6.81167     ⋅ 
    ⋅         ⋅         ⋅       -6.81167    9.11031  -10.0307
    ⋅         ⋅         ⋅         ⋅       -10.0307     4.58372

In [47]:
λ₁=myTridEigQR(T)


Out[47]:
6-element Array{Float64,1}:
  21.06147342780759
  18.277831400086132
 -16.799709914894905
   4.0008312304335
  -6.22327134594123
  -2.3171547974911006

In [48]:
(sort(λ)-sort(λ₁))./sort(λ)


Out[48]:
6-element Array{Float64,1}:
 -2.1147470383703508e-16
  1.4271889659437529e-16
  1.9165280210493494e-16
 -4.439969439070237e-16
  1.9437282252115306e-16
  1.6868305491437418e-16

Computing the eigenvectors

Once the eigenvalues are computed, the eigenvectors can be efficiently computed with inverse iteration. Inverse iteration for tridiagonal matrices is implemented in the LAPACK routine DSTEIN.


In [49]:
?LAPACK.stein!


Out[49]:
stein!(dv, ev_in, w_in, iblock_in, isplit_in)

Computes the eigenvectors for a symmetric tridiagonal matrix with dv as diagonal and ev_in as off-diagonal. w_in specifies the input eigenvalues for which to find corresponding eigenvectors. iblock_in specifies the submatrices corresponding to the eigenvalues in w_in. isplit_in specifies the splitting points between the submatrix blocks.


In [50]:
U=LAPACK.stein!(T.dv,T.ev,sort(λ₁))


Out[50]:
6×6 Array{Float64,2}:
  0.0763052   0.500654   0.324619   0.383753   -0.0965598   0.693942
  0.16074     0.6223     0.299962   0.156641    0.0731471  -0.683405
 -0.499635   -0.36068    0.205364   0.732435   -0.0616295  -0.194526
  0.798961   -0.261967  -0.221563   0.442285   -0.208778   -0.0688452
  0.256653   -0.29625    0.478726  -0.0179859   0.781298    0.0802303
  0.120393   -0.274969   0.695847  -0.30951    -0.572287   -0.0488396

In [51]:
# Orthogonality
norm(U'*U-I)


Out[51]:
4.599392953031193e-16

In [52]:
# Residual
norm(T*U-U*Diagonal(sort(λ₁)))


Out[52]:
1.784465511745828e-14

In [53]:
# Some timings - n=100, 200, 400
n=400
Tbig=SymTridiagonal(rand(n),rand(n-1))
@time myTridEigQR(Tbig);
@time λbig=eigvals(Tbig);
@time LAPACK.stein!(Tbig.dv,Tbig.ev,λbig);


  0.526962 seconds (7.03 M allocations: 627.377 MiB, 9.96% gc time)
  0.003252 seconds (5 allocations: 12.984 KiB)
  0.013695 seconds (17 allocations: 1.266 MiB)

In [54]:
n=2000
Tbig=SymTridiagonal(rand(n),rand(n-1))
@time λbig=eigvals(Tbig);
@time U=LAPACK.stein!(Tbig.dv,Tbig.ev,λbig);
@time eigen(Tbig);


  0.076606 seconds (6 allocations: 62.938 KiB)
  0.379448 seconds (18 allocations: 30.722 MiB, 1.17% gc time)
  0.713664 seconds (299.92 k allocations: 46.141 MiB, 0.48% gc time)

Alternatively, the rotations in myTridEigQR() can be accumulated in order to compute the eigenvectors. This is not optimal, but it is instructive. We keep the name of the function, using Julia's multiple dispatch feature.


In [55]:
function myTridEigQR(A1::SymTridiagonal,U::Matrix)
    # U is either the identity matrix or the output from myTridiagX()
    A=deepcopy(A1)
    n=length(A.dv)
    T=Float64
    λ=Vector{T}(undef,n)
    B=Matrix{T}
    if n==1
        return map(T,A.dv), U
    end
    if n==2
        τ=(A.dv[end-1]-A.dv[end])/2
        μ=A.dv[end]-A.ev[end]^2/(τ+sign(τ)*sqrt(τ^2+A.ev[end]^2))
        # Only rotation
        B=A[1:2,1:2]
        G,r=givens(B-μ*I,1,2,1)
        B=(G*B)*G'
        U*=G'
        return diag(B)[1:2], U
    end
    steps=1
    k=0
    while k==0 && steps<=10
        # Shift
        τ=(A.dv[end-1]-A.dv[end])/2
        μ=A.dv[end]-A.ev[end]^2/(τ+sign(τ)*sqrt(τ^2+A.ev[end]^2))
        # First rotation
        B=A[1:3,1:3]
        G,r=givens(B-μ*I,1,2,1)
        B=(G*B)*G'
        U[:,1:3]*=G'
        A.dv[1:2]=diag(B)[1:2]
        A.ev[1:2]=diag(B,-1)
        bulge=B[3,1]
        # Bulge chasing
        for i = 2 : n-2
            B=A[i-1:i+2,i-1:i+2]
            B[3,1]=bulge
            B[1,3]=bulge
            G,r=givens(B,2,3,1)
            B=(G*B)*G'
            U[:,i-1:i+2]=U[:,i-1:i+2]*G'
            A.dv[i:i+1]=diag(B)[2:3]
            A.ev[i-1:i+1]=diag(B,-1)
            bulge=B[4,2]
        end
        # Last rotation
        B=A[n-2:n,n-2:n]
        B[3,1]=bulge
        B[1,3]=bulge
        G,r=givens(B,2,3,1)
        B=(G*B)*G'
        U[:,n-2:n]*=G'
        A.dv[n-1:n]=diag(B)[2:3]
        A.ev[n-2:n-1]=diag(B,-1)
        steps+=1
        # Deflation criterion
        k=findfirst(abs.(A.ev) .< sqrt.(abs.(A.dv[1:n-1].*A.dv[2:n]))*eps())
        k=k==nothing ? 0 : k
    end
    λ[1:k], U[:,1:k]=myTridEigQR(SymTridiagonal(A.dv[1:k],A.ev[1:k-1]),U[:,1:k])
    λ[k+1:n], U[:,k+1:n]=myTridEigQR(SymTridiagonal(A.dv[k+1:n],A.ev[k+1:n-1]),U[:,k+1:n])
    λ, U
end


Out[55]:
myTridEigQR (generic function with 2 methods)

In [56]:
λ,U=myTridEigQR(T,Matrix{Float64}(I,size(T)))
λ


Out[56]:
6-element Array{Float64,1}:
  21.06147342780759
  18.277831400086132
 -16.799709914894905
   4.0008312304335
  -6.22327134594123
  -2.3171547974911006

In [57]:
U


Out[57]:
6×6 Array{Float64,2}:
  0.693942   -0.0965598  -0.0763052  -0.383753   -0.500654  -0.324619
 -0.683405    0.0731471  -0.16074    -0.156641   -0.6223    -0.299962
 -0.194526   -0.0616295   0.499635   -0.732435    0.36068   -0.205364
 -0.0688452  -0.208778   -0.798961   -0.442285    0.261967   0.221563
  0.0802303   0.781298   -0.256653    0.0179859   0.29625   -0.478726
 -0.0488396  -0.572287   -0.120393    0.30951     0.274969  -0.695847

In [58]:
# Orthogonality
norm(U'*U-I)


Out[58]:
1.360048863945506e-15

In [59]:
# Residual
norm(T*U-U*Diagonal(λ))


Out[59]:
2.1755383380997447e-14

Symmetric QR method

Combining myTridiag(), myTridiagX(), and myTridEigQR(), we obtain the method for computing the symmetric EVD.


In [60]:
function mySymEigQR(A::Matrix)
    T,H=myTridiag(A)
    X=myTridiagX(H)
    myTridEigQR(T,X)
end


Out[60]:
mySymEigQR (generic function with 1 method)

In [61]:
A


Out[61]:
6×6 Array{Int64,2}:
  9   5  -3  4  -8  -6
  5  -3   3  9  -5   4
 -3   3   4  8  -4   6
  4   9   8  4   4   1
 -8  -5  -4  4   2   9
 -6   4   6  1   9   2

In [62]:
λ,U=mySymEigQR(map(Float64,A))
λ


Out[62]:
6-element Array{Float64,1}:
  21.06147342780759
  18.277831400086132
 -16.799709914894905
   4.0008312304335
  -6.22327134594123
  -2.3171547974911006

In [63]:
U


Out[63]:
6×6 Array{Float64,2}:
  0.693942   -0.0965598  -0.0763052  -0.383753   -0.500654   -0.324619
  0.197082   -0.424196   -0.496826   -0.0801954   0.671749   -0.276953
 -0.0849644  -0.553229   -0.328222    0.597182   -0.465916    0.0726892
  0.0792565  -0.632641    0.437101   -0.337355    0.0793185   0.531344
 -0.518295   -0.0200926  -0.525949   -0.604088   -0.263271    0.14181
 -0.444379   -0.322541    0.414647   -0.104615   -0.0736083  -0.714283

In [64]:
# Orthogonality 
norm(U'*U-I)


Out[64]:
1.271419493661103e-15

In [65]:
# Residual
norm(A*U-U*Diagonal(λ))


Out[65]:
2.5780268540038566e-14

Unsymmetric matrices

The $QR$ iterations for unsymmetric matrices are implemented as follows:

  1. Reduce $A$ to Hessenberg form $H$ by orthogonal similarities, $X^TAX=H$.
  2. Compute the EVD of $H$ with QR iterations, $H=Q\Lambda Q^*$.
  3. Multiply $U=XQ$.

The algorithm requires $O(n^3)$ operations. For more details, see D. Watkins, Unsymmetric Matrix Eigenvalue Techniques and the references therein.
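
Since a real unsymmetric matrix may have complex conjugate pairs of eigenvalues, the QR iterations in practice compute the real Schur form $A=ZTZ^T$, where $Z$ is orthogonal and $T$ is quasi upper triangular with $1\times 1$ and $2\times 2$ diagonal blocks. Here is a minimal sketch using the built-in schur() function (the test matrix is random; the names C and F are ours):

using LinearAlgebra
C=rand(-9:9,5,5)
F=schur(float(C))
F.T         # quasi upper triangular factor
F.values    # the (possibly complex) eigenvalues
norm(F.Z*F.T*F.Z'-C)  # backward check of the factorization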


In [66]:
A=rand(-9:9,5,5)


Out[66]:
5×5 Array{Int64,2}:
 -2   5  -3  -1   3
 -6  -3  -8  -7   9
 -4  -6   4  -9   5
  8  -9   7  -9  -1
  9  -5   0   2   8

In [67]:
λ,X=eigen(A)
λ


Out[67]:
5-element Array{Complex{Float64},1}:
 -13.835883548290738 + 0.0im
 -2.1982137589517228 - 8.898841182206134im
 -2.1982137589517228 + 8.898841182206134im
   6.785962600575749 + 0.0im
   9.446348465618422 + 0.0im

In [68]:
X


Out[68]:
5×5 Array{Complex{Float64},2}:
 -0.10983+0.0im    0.14314+0.356492im    …  -0.260512+0.0im  -0.338603+0.0im
 0.592037+0.0im   0.675378-0.0im            -0.232596+0.0im  -0.357307+0.0im
 0.462706+0.0im   0.245017+0.126962im        0.749069+0.0im  -0.162656+0.0im
 0.639042+0.0im  -0.463508-0.00880585im      0.302679+0.0im  0.0121097+0.0im
 0.122302+0.0im  0.0128809-0.324119im        0.474672+0.0im   -0.85503+0.0im

In [69]:
# Residual 
norm(A*X-X*Diagonal(λ))


Out[69]:
2.376942186679338e-14

Hessenberg factorization


In [70]:
?hessenberg


search: hessenberg hessenberg! Hessenberg UpperHessenberg

Out[70]:
hessenberg(A) -> Hessenberg

Compute the Hessenberg decomposition of A and return a Hessenberg object. If F is the factorization object, the unitary matrix can be accessed with F.Q (of type LinearAlgebra.HessenbergQ) and the Hessenberg matrix with F.H (of type UpperHessenberg), either of which may be converted to a regular matrix with Matrix(F.H) or Matrix(F.Q).

If A is Hermitian or real-Symmetric, then the Hessenberg decomposition produces a real-symmetric tridiagonal matrix and F.H is of type SymTridiagonal.

Note that the shifted factorization A+μI = Q (H+μI) Q' can be constructed efficiently by F + μ*I using the UniformScaling object I, which creates a new Hessenberg object with shared storage and a modified shift. The shift of a given F is obtained by F.μ. This is useful because multiple shifted solves (F + μ*I) \ b (for different μ and/or b) can be performed efficiently once F is created.

Iterating the decomposition produces the factors F.Q, F.H, F.μ.

Examples

julia> A = [4. 9. 7.; 4. 4. 1.; 4. 3. 2.]
3×3 Array{Float64,2}:
 4.0  9.0  7.0
 4.0  4.0  1.0
 4.0  3.0  2.0

julia> F = hessenberg(A)
Hessenberg{Float64,UpperHessenberg{Float64,Array{Float64,2}},Array{Float64,2},Array{Float64,1},Bool}
Q factor:
3×3 LinearAlgebra.HessenbergQ{Float64,Array{Float64,2},Array{Float64,1},false}:
 1.0   0.0        0.0
 0.0  -0.707107  -0.707107
 0.0  -0.707107   0.707107
H factor:
3×3 UpperHessenberg{Float64,Array{Float64,2}}:
  4.0      -11.3137       -1.41421
 -5.65685    5.0           2.0
   ⋅        -8.88178e-16   1.0

julia> F.Q * F.H * F.Q'
3×3 Array{Float64,2}:
 4.0  9.0  7.0
 4.0  4.0  1.0
 4.0  3.0  2.0

julia> q, h = F; # destructuring via iteration

julia> q == F.Q && h == F.H
true

In [71]:
B=hessenberg(A)


Out[71]:
Hessenberg{Float64,UpperHessenberg{Float64,Array{Float64,2}},Array{Float64,2},Array{Float64,1},Bool}
Q factor:
5×5 LinearAlgebra.HessenbergQ{Float64,Array{Float64,2},Array{Float64,1},false}:
 1.0   0.0        0.0         0.0         0.0
 0.0  -0.427482   0.541573   -0.495474   -0.527695
 0.0  -0.284988  -0.0196289  -0.584449    0.759484
 0.0   0.569976  -0.421563   -0.642308   -0.291295
 0.0   0.641223   0.727047   -0.0191312   0.244681
H factor:
5×5 UpperHessenberg{Float64,Array{Float64,2}}:
 -2.0      0.071247   5.36946   -0.139112   -3.89159
 14.0357   1.01523   -4.25582   -2.10143     6.9091
   ⋅      10.668      6.73332    3.71058    -2.66098
   ⋅        ⋅        -5.16838  -12.871     -13.8173
   ⋅        ⋅          ⋅        -2.27566     5.12246

In [72]:
B.H


Out[72]:
5×5 UpperHessenberg{Float64,Array{Float64,2}}:
 -2.0      0.071247   5.36946   -0.139112   -3.89159
 14.0357   1.01523   -4.25582   -2.10143     6.9091
   ⋅      10.668      6.73332    3.71058    -2.66098
   ⋅        ⋅        -5.16838  -12.871     -13.8173
   ⋅        ⋅          ⋅        -2.27566     5.12246

In [73]:
B.Q


Out[73]:
5×5 LinearAlgebra.HessenbergQ{Float64,Array{Float64,2},Array{Float64,1},false}:
 1.0   0.0        0.0         0.0         0.0
 0.0  -0.427482   0.541573   -0.495474   -0.527695
 0.0  -0.284988  -0.0196289  -0.584449    0.759484
 0.0   0.569976  -0.421563   -0.642308   -0.291295
 0.0   0.641223   0.727047   -0.0191312   0.244681

In [74]:
B.Q'*A*B.Q


Out[74]:
5×5 Array{Float64,2}:
 -2.0      0.071247      5.36946   -0.139112   -3.89159
 14.0357   1.01523      -4.25582   -2.10143     6.9091
  0.0     10.668         6.73332    3.71058    -2.66098
  0.0     -1.77636e-15  -5.16838  -12.871     -13.8173
  0.0      1.9984e-15    0.0       -2.27566     5.12246

In [77]:
eigvals(Matrix(B.H))


Out[77]:
5-element Array{Complex{Float64},1}:
 -13.835883548290726 + 0.0im
   -2.19821375895172 - 8.898841182206123im
   -2.19821375895172 + 8.898841182206123im
   6.785962600575748 + 0.0im
   9.446348465618417 + 0.0im

In [ ]: